Six parameters of importance in many communication systems are: (a) 
the rate at which digital information is transmitted; (b) the bandwidth of 
the system; (c) the signal power of the transmitted signals; (d) the noise 
power of disturbances in transmission; (e) the error probability in digits 
recovered at the receiver output; (f) the length of time that the transmitter 
and receiver can store their inputs. These six parameters cannot assume 
arbitrary values: certain sets of values cannot be realized. In a series of 
curves, this paper describes the boundary between compatible and incom- 
patible sets of parameter values. In the model studied, it is assumed that 
the disturbance is additive Gaussian noise with constant power density 
spectrum in the transmission band. 

I. INTRODUCTION 

In comparing the performance of communication systems that trans- 
mit information by means of signals of limited bandwidth, six quantities 
descriptive of the system and its environment are of particular importance:
(i) the rate at which the system transmits information; (ii) the
bandwidth occupied by the transmission signals; (iii) a measure of the
power of these signals; (iv) a measure of the ambient noise which per- 
turbs the transmitted signals; (v) the delay time (caused by the trans- 
mitter and receiver) between the introduction of information at the in- 
put of the system and the emergence of useful information at the output 
of the system; (vi) a measure of the fidelity with which the information 
at the output of the system represents the information presented to the 
input of the system. 

To compare the performance of two communication systems in a 
meaningful manner, it is usually necessary to consider the values of at 
least these six quantities for the two systems. In general, such a com- 
parison will not yield a simple ordering of the two systems. Two systems 
may utilize the same bandwidth, introduce the same delay, and operate 
in the same noise environment. The first system may transmit information
at a greater rate with somewhat better fidelity than the second, but
may require much more signal power. Which system is to be judged bet- 
ter then depends on external considerations such as the economics of 
equipment and the purpose for which communication is being established. 
These external considerations allow the engineer to assign relative 
weights or costs to the six quantities in question. 

Quite apart from these costs dictated by external considerations that 
may vary with every conceivable usage of a communication system, it 
is clearly desirable to know, in the first place, what mutual values of the 
six quantities can ever be obtained by any means. In order to provide 
such quantitative information it is necessary to particularize both the 
model of the communication system and the six descriptive parameters. 

In all that follows we shall assume that a discrete message source 
presents independent equiprobable decimal digits for transmission at 
the uniform rate R decimal digits (or dits) per second. (The output of 
any other discrete source having entropy rate R can be encoded into 
this form.) A transmitter operates on these decimal digits to produce a 
continuous signal of average power S lying in the frequency band (0,W) 
cycles/second. The signal produced by the transmitter is perturbed by 
the addition of independent Gaussian noise of total power N and constant 
power spectral density N/W in the band (0,W) cycles/second. A receiver
operates on the perturbed signal to produce decimal digits at an
average rate R symbols/second. When the receiver output symbols and 
the transmitter input symbols are placed in proper correspondence, the 
average probability, $P_e$, that an output symbol be different from the
corresponding input symbol will be taken as the measure of fidelity 
with which the system operates. To perform their coding functions, the 
transmitter and receiver may each require the internal storage of T 
seconds of their inputs. We use the dimensionless parameter 

n = 2WT 

(that is, T measured in Nyquist intervals) as a measure of the delay or 
complexity of encoding associated with transmitter and receiver. 

Our concern henceforth is with the six quantities R, W, S, N, n, and 
$P_e$ of this model and with the determination of the boundaries of the
region of compatible values for these parameters. The famous capacity
formula of Shannon 1 published in 1948, $C = W \log (1 + S/N)$, provides
information about this boundary when $n \to \infty$, i.e., when arbitrarily
complicated receiver and transmitter coding operations are allowed. The
astonishing fact that $P_e$ could be made arbitrarily small for certain
finite nonzero values of R, W, and S/N by letting $n \to \infty$ promised the
existence of most remarkable and previously unsuspected communication
systems. This led Gilbert 2 and others to compute the values of R,
W, S/N, and $P_e$ obtainable with specific transmitters and receivers
having fixed delay n and to compare these results with Shannon's 
formula. The results were disappointing. For all systems examined, even 
those permitting quite complex encodings (n = 100), it was found that 
to achieve practical values of P e , S/N had to be at least db more than 
that given by the capacity formula. The question arose: was this result 
due to the comparative poorness of the specific systems chosen, or is 
the approach to the ideal systems described by the capacity formula 
inherently very slow with increasing n? For a fixed finite value of n, what
values of R, W, S/N and $P_e$ are theoretically attainable?

Some information on this subject for large values of n was given by 
Rice 3 as early as 1950. The question was answered in considerable detail 
by Shannon in an important paper 4 in which he presented a number of 
inequalities that permit rather accurate determination of the region of 
attainable parameter values for all values of n. Shannon's primary interest 
here was again in the case of large delay, and he developed asymptotic 
forms for his inequalities in this case. For small delay, the inequalities 
involve quite complicated expressions and their numerical evaluation is 
not a simple matter. 

The present paper describes in Appendix A a technique which, by 
means of an electronic computer, permits highly accurate evaluation of 
the quantities entering these inequalities. The technique has been used 
to map out bounds on the compatible region of the six quantities in 
question over a wide range of parameter values. The results of the com- 
putations are presented here in a number of curves which cross plot the 
quantities in various ways which we hope will be useful to the communi- 
cation engineer.* In particular, the curves show quantitatively the 
improvement in communication systems that can be achieved with a 
given degree of coding (measured by delay). Considerable improvement 
can be obtained with a small amount of encoding, but to approach within 
a few db of the capacity formula in general requires extremely compli- 
cated systems. The curves also give numerical information concerning 
the trade-offs of the various parameters. They should provide useful 
references of comparison for existing communication systems. 

* An application of these curves to the problem of determining the threshold 
in modulation systems that expand bandwidth is given in Ref. 5. 
Fig. 1 - Relationship between signal parameters with arbitrarily complex
encoding. Solid curve gives $y = 10 \log_{10} (S/N)$ vs R/W; dashed curve gives
$z = 10 \log_{10} (SW/NR)$ vs R/W.

II. IDEAL SYSTEMS — UNRESTRICTED CODING 

The solid curve on Fig. 1 shows a plot of the relation 

$$R = W \log_{10} (1 + S/N) \tag{1}$$

in terms of the two dimensionless quantities

$$r = R/W, \qquad y = 10 \log_{10} (S/N).$$

This curve can be interpreted* as follows. For values of R, W, S and N
corresponding to points above the curve, transmission with arbitrarily
small positive values of $P_e$ can be achieved by use of sufficiently complicated
coding schemes (sufficiently large finite values of n). For values
of R, W, S and N corresponding to points below the curve, $P_e$ is bounded
away from zero independently of n. For systems represented by these 
points, no amount of coding can make the error probability arbitrarily 
small. 
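
As a rough numerical illustration (a sketch added for the reader, not part of the original computations), relation (1) can be inverted to give the smallest S/N, in db, compatible with arbitrarily small error probability at a given r = R/W. The function names are illustrative assumptions.

```python
import math


def min_snr_db(r):
    """Ordinate y of the solid curve of Fig. 1.

    Returns the S/N (in db) at which r = log10(1 + S/N), where r = R/W
    is measured in dits per cycle.
    """
    return 10.0 * math.log10(10.0 ** r - 1.0)


def above_capacity_curve(R, W, S, N):
    """True when the point (R/W, S/N) lies above the curve of (1),
    i.e. when R < W log10(1 + S/N)."""
    return R < W * math.log10(1.0 + S / N)


if __name__ == "__main__":
    for r in (0.2, 0.6, 1.0):
        print(f"R/W = {r:.1f}: S/N must exceed {min_snr_db(r):.2f} db")
```

Points for which `above_capacity_curve` is true are those for which sufficiently complicated coding can make the error probability arbitrarily small; the function says nothing about how large n must be.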



* There are many subtle and thorny points in the argument that permits one to 
apply the capacity formula to communication systems transmitting continuous 
signals. Some of these points are discussed in Appendix B. 
In many communication situations, the quantity

$$Z = \frac{S/N}{R/W} = \frac{S/R}{N/W}$$

is a useful system parameter. This quantity is the signal energy per dit
divided by the noise power per unit bandwidth. From (1),

$$Z = (10^{r} - 1)/r. \tag{2}$$

The dashed curve of Fig. 1 shows a plot of

$$z = 10 \log_{10} Z$$

vs r as determined by (2). For a given value of r, values of z above the
curve are attainable with arbitrarily small positive $P_e$ and finite delay;
arbitrarily small positive values of $P_e$ cannot be obtained for z values
below the curve with finite delay.
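
The behavior of (2) at low rates can be checked numerically. The sketch below (helper names are assumptions, not taken from the paper) evaluates z and the limit of z as r tends to zero, where Z tends to ln 10, i.e. about 3.62 db of signal energy per dit relative to the noise power per unit bandwidth.

```python
import math

LN10 = math.log(10.0)


def energy_ratio_db(r):
    """z = 10 log10 Z with Z = (10**r - 1)/r as in (2); requires r > 0."""
    # expm1 avoids loss of precision in 10**r - 1 when r is small.
    Z = math.expm1(r * LN10) / r
    return 10.0 * math.log10(Z)


def low_rate_limit_db():
    """Limit of z as r -> 0, where Z -> ln 10."""
    return 10.0 * math.log10(LN10)


if __name__ == "__main__":
    for r in (0.01, 0.2, 0.6, 1.0):
        print(f"R/W = {r:<4}: z = {energy_ratio_db(r):.3f} db")
    print(f"limit as R/W -> 0: {low_rate_limit_db():.3f} db")
```

Since Z increases with r, the low-rate limit is the smallest value the dashed curve of Fig. 1 can take.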

The curves on Fig. 1 describe the relations between R, W, S and N 
along the intersection of the planes $P_e = 0$, $n = \infty$ with the boundary
of the region of mutual compatibility of the six parameters. The intersection
of any two other planes, say $P_e = c_1$ and $n = c_2$, with this
boundary also determines a curve in the y-r or z-r plane. Unfortunately,
the exact form of these curves is not known at present. 

III. FINITE n AND NONZERO $P_e$

To understand fully the assumptions implicit in the remaining curves 
to be presented here, it is necessary to recall the approach taken by 
Shannon in Refs. 4 and 6. 

Since the signal produced by the transmitter is limited in frequency to 
the band (0,W) cycles/second, it can (according to the sampling theorem)
be thought of as generated by the application of a train of impulses
as input to an ideal low-pass filter with cutoff frequency W. The impulses
are spaced 1/(2W) seconds apart and are of varying amplitude.
During a fixed time T, n = 2WT such impulses are applied to the filter.
During this same time T, the information source can produce one of
$M = 10^{RT}$ different messages. One method, then, of determining from
the output of the information source the train of impulses to be applied 
to the filter is to provide a dictionary that lists for each of the possible 
M messages a corresponding sequence of n impulses. The transmitter 
examines the source output for T seconds and determines which of the 
M messages was produced. The dictionary is then consulted to obtain 
the corresponding sequence of n impulses. These impulses are applied 
at a uniform rate to the filter during the next T seconds. At the end of 
this time, the source has produced another message from the list of M 
messages and the process is repeated. This method of encoding the 
source is known as block coding of length n. 

In a block coding scheme of length n, the average power of the signal 
produced at the output of the filter depends on the amplitudes of the 
impulses listed in the encoding dictionary. It is easy to show that each 
word of the dictionary, i.e., each sequence of n impulses, contributes an 
energy $d^2/2W$ to the transmitted signal. Here $d^2$ is the sum of the squares
of the amplitudes of the n impulses in question. Since one word is transmitted
every T seconds, one method of achieving average power S for
the transmitted signal is to require that $d^2 = nS$ for each word of the
dictionary. We shall refer to dictionaries of this sort as equal energy 
block codes. 
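
The dictionary construction just described can be sketched in a few lines. The example below is only an illustration under stated assumptions: the code words are drawn in random directions (a choice made here for convenience, not a construction used in this paper), RT and 2WT are assumed to be integers, and all function names are hypothetical.

```python
import math
import random


def equal_energy_codebook(R, W, T, S, seed=0):
    """Build a dictionary of M = 10**(R*T) words, each a sequence of
    n = 2*W*T impulse amplitudes whose squares sum to d**2 = n*S."""
    n = round(2 * W * T)          # impulses per word
    M = round(10 ** (R * T))      # number of messages in time T
    rng = random.Random(seed)
    target = math.sqrt(n * S)     # required length d of every word
    book = []
    for _ in range(M):
        # Random direction (an assumption of this sketch), then rescale
        # so that the word has exactly the prescribed energy.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        book.append([target * x / norm for x in v])
    return book


def encode(digits, book):
    """Look up the impulse sequence for a block of R*T decimal digits."""
    index = int("".join(str(d) for d in digits))
    return book[index]


if __name__ == "__main__":
    book = equal_energy_codebook(R=1.0, W=2.5, T=2.0, S=3.0)
    word = encode([4, 2], book)
    print(len(book), len(word), sum(x * x for x in word))
```

Each word carries energy $d^2/2W = nS/2W = ST$, so transmitting one word every T seconds gives average power S, as required.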

In Ref. 4, Shannon presents explicit formulae for functions $\overline{Q}_n(r,Y)$
and $\underline{Q}_n(r,Y)$ which have the following significance. For the communication
model under discussion, there exist transmitters and receivers using
equal energy block codes of length n such that

$$P_e \le \overline{Q}_n(R/W, S/N).$$

For every equal energy block code of length n, the system parameters
satisfy the inequality

$$P_e \ge \underline{Q}_n(R/W, S/N).$$

Here $P_e$ is the probability that a transmitted word of the dictionary be
decoded incorrectly. The functions $\overline{Q}_n$ and $\underline{Q}_n$ and their numerical
evaluation are discussed further in Appendix A.

Consider now a relationship such as

$$\underline{Q}_{101}(R/W, S/N) = 10^{-4} \tag{3}$$

which serves to determine S/N as a function of R/W. This relation could
be plotted on Fig. 1 with S/N measured in db to yield a curve lying above
the solid-line capacity curve shown there. For our purposes, the vertical
difference between these two curves is of primary interest. This difference
is shown by the bottom solid curve of Fig. 2. Explicitly, the bottom
curve of Fig. 2 is a plot of

$$y = 10 \log_{10} (S/N) - 10 \log_{10} (10^{R/W} - 1)$$

vs R/W, where S/N is given in terms of R/W by (3). The bottom dashed
curve of Fig. 2 is an analogous display of the relation defined by

$$\overline{Q}_{101}(R/W, S/N) = 10^{-4}.$$

Fig. 2 - Upper and lower bounds (dashed and solid curves, respectively) on
S/N needed to achieve word-error probability of $10^{-4}$ for various values of n =
2WT. Circle, triangles, and crosses give performance of some known codes.

The two bottom curves on Fig. 2 have the following significance. 
For a given value of R/W, there exist equal energy block codes of length 
101 that will achieve an error probability of $P_e = 10^{-4}$ with as small a
value of S/N as that given by the ordinate of the dashed curve. On the
other hand, every equal energy block code of length 101 that achieves
an error probability of $10^{-4}$ must operate with a value of S/N at least
as large as the ordinate of the solid curve. The curves thus serve to
bound the minimal signal-to-noise ratio with which an error probability
of $10^{-4}$ can be achieved when equal energy block codes of length 101 are
employed. The bounds are plotted in db above the signal-to-noise ratio 
given by the capacity formula, and thus measure the penalty in signal- 
to-noise ratio that must be paid for restricting the coding (n = 101). 

The remaining curves on Fig. 2 give analogous results for n = 5 and
n = 25. It is to be noted that the solid and dashed curves are much 
closer together for large n, than for small n. This effect is shown more 
clearly on Fig. 3, which was obtained from a cross plot of many curves 
Fig. 3 - Cross-sections of Figure 2 taken for R/W = 0.2 and 0.6.

of the sort shown on Fig. 2. For n = 101, there is little practical differ- 
ence between the two bounds. For small values of n, however, the dis- 
parity is great, and the question naturally arises: does the solid curve, 
or the dashed curve, more nearly represent the minimal signal-to-noise 
ratio needed to obtain $P_e = 10^{-4}$ with an equal energy block code of
fixed length n? 

We believe that the lower bound, obtained from $\underline{Q}$, is quite close to the
minimal attainable S/N even for small n. Indeed, for n = 5, we have been
able to construct explicit equal energy block codes with a variety of
rates whose parameters plot close to the top-most solid line of Fig. 2
when S/N was adjusted to guarantee an error probability not greater
than $10^{-4}$. The five right-most triangles in the figure locate the performance
of certain block codes known as simplex codes [the (D, D + 1)
codes of Ref. 2]. The crosses locate the performance of certain new codes
to be described in a later paper. The circle gives the performance of 5-bit
PCM. The four left-most triangles locate the performance of some simplex
codes of block length 25. Apart from these explicit examples that
plot near the bounds obtained from $\underline{Q}$, there are theoretical considerations
which show that the upper bound $\overline{Q}$ is a very weak bound for small values of n.
Henceforth, in this paper we shall deal only with bounds obtained from
$\underline{Q}$ and shall treat the relationship

$$\underline{Q}_n(R/W, S/N) = P_e \tag{4}$$

as the defining equation of the boundary of the region of compatible
values of R, W, S, N, $P_e$ and n for equal energy block codes.

IV. DISCUSSION OF RESULTS

Figs. 4, 5 and 6 give plots of S/N vs R/W as determined from (4) for
various values of $P_e$ and n. The ordinates here, as in Fig. 2, are given
in db above capacity, i.e., in db above the solid curve of Fig. 1. One 
advantage of this representation is that the ordinates of Figs. 4, 5 and 6 
may also be interpreted as values of Z, the latter now being measured 
in db above the capacity value given by the dashed line of Fig. 1. 

From Figs. 4, 5 and 6, it is apparent that for a fixed rate and fixed 
error probability modest amounts of coding (small values of n) can 
produce a significant reduction in signal power, but that the return for 
increased encoding diminishes rapidly. This is seen more clearly from 
the cross plot given on Fig. 7. 

The improvement in performance that can be obtained by encoding 
can also be expressed in terms of decreased error probability for a fixed 
rate and signal-to-noise ratio as is shown in Fig. 8. 

Fig. 4 - Minimum possible S/N to attain word-error probability of $10^{-2}$ for
various values of R/W and n.

Fig. 5 - Minimum possible S/N to attain word-error probability of $10^{-4}$ for
various values of R/W and n.

Fig. 6 - Minimum possible S/N to attain word-error probability of $10^{-6}$ for
various values of R/W and n.

Fig. 7 - Cross-plot of Figs. 4, 5 and 6 showing (for R/W = 0.2) decrease in
S/N needed to achieve a given word-error probability as n is increased.

An interesting feature of Figs. 4, 5 and 6 is the minimum value clearly
evident on the n = 5 curves. It is not hard to show (see Appendix C)
that for all values of n, the curves obtained from (4) as plotted on these 
figures must rise indefinitely with increasing R/W. For equal energy 
block codes, there is, for any fixed $P_e$ and n, a best value of R/W in the
sense of minimizing the additional signal-to-noise ratio needed above 
that given by the channel capacity formula. When the curves of Figs. 4, 
5 and 6 are plotted on a graph such as Fig. 1 with absolute S/N as
ordinate, the curves are monotone increasing but eventually for large 
R/W depart further and further above the capacity formula curve. This 
phenomenon is due to the restriction imposed here that all code words 
of the dictionary have the same energy, a restriction likely to be realized 
in practice. This point is discussed further in Appendix D. 

Fig. 8 - Word-error probability vs n for given R/W and S/N above ideal for
best possible equal-energy codes.

Another way of presenting (4) that shows the departure from the
ideal system of the capacity formula that results with equal energy block
codes of restricted length is shown in Fig. 9. Fix $P_e$ and n. Then from
(4), a given value of r = R/W determines a corresponding signal-to-noise
ratio, S/N. From the capacity formula, using this value of S/N
it is possible to achieve any desired $P_e$ with a rate per bandwidth
$f = \log_{10} (1 + S/N)$ by sufficiently complex encoding. The ratio r/f then
measures the price paid in lost rate due to restricting the amount of
encoding. The solid curves on Fig. 9 were obtained from $\underline{Q}$ and give
upper bounds on r/f for equal energy block codes; the dashed curves
derived from $\overline{Q}$ give lower bounds for this ratio. It can be shown (see
Appendix C) that the solid curves approach (n - 1)/n asymptotically
with increasing R/W.

Fig. 9 - Upper and lower bounds (solid and dashed curves, respectively) on
fractional loss in rate, r/f, due to finite encoding. Loss plotted vs R/W for fixed
n and $P_e$.

Yet another way of viewing the bounds is given on Fig. 10. Here, for a 
fixed signal-to-noise ratio and a fixed error probability, the improvement 
in signaling rate that can be obtained by increasing the length of equal 
energy codes is shown. It is seen, for example, that even with signal to 
noise ratios as high as 20 db, one cannot achieve 75 per cent of the ideal 
rate with equal energy codes of length less than 15 when the prescribed 
error probability is $10^{-6}$. The $S/N = \infty$ curve is given by r/f =
(n - 1)/n. That this limiting curve is different from unity is again due
to the fact that the bounds used here are those for equal energy codes.
If restricted energy codes were used (see Section V), the limiting curve
corresponding to $S/N = \infty$ would be r/f = 1.



V. CONCLUDING REMARKS 

The exact computation of Q n that was carried out here allows one to 
test the range of validity of Shannon's asymptotic expressions for this 
quantity. On plots such as Figs. 4, 5 and 6, his formula* (4) of Ref. 4 
gives curves in very close agreement with those shown for n = 101. 
At n = 25 the error is about 0.1 db at large rates and 0.3 db at small 
rates. This formula was used to compute the curves for n = 500 and 
1000 shown on Fig. 5. Although it involves only elementary functions, 

* This formula contains a misprint. The printed version must be multiplied by 
— G to be corrected. 






THE BELL SYSTEM TECHNICAL JOURNAL, MAY 1963 



Fig. 10 - Rate loss of Fig. 9 plotted vs n for fixed S/N and $P_e$.
[Figure: curves of r/f vs n for S/N = 5, 10, 15 and 20 db, with $P_e = 10^{-6}$.]

the formula is quite complicated, and for extensive computations ma- 
chine methods are desirable. For moderate or small values of n, exact 
values of Q can be obtained by the method of Appendix A with com- 
parable ease. Shannon's elementary asymptotic formula (73) of Ref. 4 
has also been evaluated. For n = 500 and 1000, it gives values that 
agree with the curves of Fig. 5 to about 0.1 db for R/W > 0.5. For small 
rates it gives values 0.5 db too large. The accuracy of the formula 
diminishes rapidly as n is decreased below 100. 

The bounds presented here were obtained for communication systems 
using equal energy block codes of fixed length. It is, of course, possible 
to signal using block codes that have words of differing energy. One code 
of this sort of particular interest that is treated by Shannon in Ref. 4, 
Section XIII is the restricted energy block code. In these codes, each word 
of the dictionary contributes energy ST or less to the transmitted signal, 
i.e., for each code word $d^2 \le nS$. Note that for these codes S is no longer
the average signal power, but rather the maximum contribution to the 
signal power by any code word. 

For any communication system with parameters R, W, S, N using a 
restricted energy block code of length n, Shannon showed that the average
error probability, $P_e'$, for a decoded word is bounded below by

$$P_e' \ge \underline{Q}_{n+1}\left(\frac{n}{n+1}\,\frac{R}{W},\ \frac{S}{N}\right). \tag{5}$$

For any fixed value of R/W, as n becomes large this lower bound approaches
the one already given for equal energy block codes, and so
asymptotically (in n) one can do no better with restricted energy codes
than with equal energy codes. However, for any fixed value of n, as 
R/W becomes large the lower bounds for the two classes of codes be- 
have very differently, and indeed it is easy to argue that in this limit 
restricted energy codes are superior to equal energy codes. This point is 
discussed further in Appendix D. 

The solid curves of Fig. 11 are those already shown in Fig. 6. The 
dashed curves were obtained from the lower bound (5) for restricted 
energy block codes. These dashed curves approach the horizontal 
asymptotes indicated at the right. From the figure it is seen that for 
R/W < 0.6 and $n \ge 25$ the bounds for restricted energy codes differ
from those for equal energy codes by less than 0.2 db. For small values 
of n, the dashed curves lie below the solid ones even for small rates. 

Fig. 11 - Comparison of bounds for equal-energy codes and restricted-energy
codes.

It should be pointed out in closing that the error probability $P_e$ used
throughout these calculations is the probability that a word of the block
code be improperly identified when a maximum likelihood receiver is
used. This is not in general the probability that an individual decoded
decimal digit be in error but rather an upper bound to this quantity.
For large n, a single code word is decoded into many decimal digits.
The received code word may be incorrectly identified and yet decoded
into a block of decimal digits many of which are correct. When large 
block codes are used and $P_e$ is small, errors in the decoded stream of
decimal digits are not distributed uniformly. Many successive groups 
of decimal digits, each containing RT digits, will be error free. Then a 
single block of RT digits will be produced that contains from one to 
RT erroneous digits. This bunching of errors may, in certain applica- 
tions, be a serious drawback to the use of block coding. 
APPENDIX A 

Computation of $\underline{Q}$ and $\overline{Q}$

Our notation is similar to Shannon's 4 and we here adopt his geometrical
point of view:

S = signal power (each signal vector is of length $\sqrt{nS}$);
N = noise power (variance N in each dimension);
$A = \sqrt{S/N}$ = signal-to-noise "amplitude" ratio;
n = number of dimensions;
M = number of signal vectors;
$\Omega(\theta)$ = solid angle in n-space of a cone of half-angle $\theta$, or area of unit
n-sphere cut out by the cone;
$Q(\theta)$ = probability of a point X in n-space, at distance $A\sqrt{n}$ from
the origin, being moved outside a circular cone of half-angle $\theta$ with
vertex at the origin and axis OX (the perturbation is assumed spherical
Gaussian with unit variance in all dimensions);
$\theta_1$ = angle such that $M\,\Omega(\theta_1) = \Omega(\pi)$.

Shannon shows [his equation (20)] that

$$Q(\theta_1) \le P_e \le Q(\theta_1) - \int_0^{\theta_1} \frac{\Omega(\theta)}{\Omega(\theta_1)}\, dQ(\theta),$$

where $P_e$ is the error probability of the best equal energy M-vector code
in n-space used with signal-to-noise ratio A. We proceed to discuss the
evaluation of these quantities.

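As shown by Shannon [his equation (21)]

$$\Omega(\theta) = \frac{(n-1)\,\pi^{(n-1)/2}}{\Gamma\!\left(\frac{n+1}{2}\right)} \int_0^{\theta} (\sin \xi)^{n-2}\, d\xi. \tag{6}$$

The surface $\Omega(\pi)$ of the unit n-sphere has area

$$\Omega(\pi) = \frac{n\,\pi^{n/2}}{\Gamma\!\left(\frac{n}{2} + 1\right)}.$$

A change of variable $\sin^2 \xi = t$ shows that

$$\frac{\Omega(\theta)}{\Omega(\pi)} = \frac{1}{2}\,\frac{\Gamma\!\left(\frac{n}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{n-1}{2}\right)} \int_0^{\sin^2\theta} t^{(n-3)/2} (1-t)^{-1/2}\, dt = \frac{1}{2}\, I_{\sin^2\theta}\!\left(\frac{n-1}{2}, \frac{1}{2}\right),$$

where $I_x(p,q)$ is Pearson's incomplete beta function. 7 Thus $\theta_1$ is given by

$$\frac{1}{M} = \frac{1}{2}\, I_{\sin^2\theta_1}\!\left(\frac{n-1}{2}, \frac{1}{2}\right). \tag{7}$$

The rate is related to M by

$$\frac{R}{W} = \frac{2}{n} \log_{10} M. \tag{8}$$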
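
Relations (7) and (8) can be solved for $\theta_1$ numerically. The sketch below is an illustration only: instead of tables of the incomplete beta function it integrates $(\sin \xi)^{n-2}$ directly by Simpson's rule and then bisects on $\theta$. The function names, the step count and the bisection depth are assumptions of this sketch, which is adequate for moderate n but is not the procedure used for the curves in this paper.

```python
import math


def _sin_power_integral(upper, n, steps=2000):
    """Simpson's rule for the integral of sin(x)**(n-2) over (0, upper).

    `steps` must be even; n >= 2 is assumed.
    """
    h = upper / steps
    total = math.sin(0.0) ** (n - 2) + math.sin(upper) ** (n - 2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.sin(k * h) ** (n - 2)
    return total * h / 3.0


def cap_fraction(theta, n):
    """Omega(theta) / Omega(pi): fraction of the unit n-sphere inside a
    cone of half-angle theta, from (6)."""
    return _sin_power_integral(theta, n) / _sin_power_integral(math.pi, n)


def theta_1(n, rate):
    """Solve M * Omega(theta_1) = Omega(pi) with M = 10**(n*rate/2),
    where rate = R/W as in (8)."""
    M = 10.0 ** (n * rate / 2.0)
    if M < 2.0:
        raise ValueError("need M >= 2 so that theta_1 <= pi/2")
    full = _sin_power_integral(math.pi, n)
    lo, hi = 0.0, math.pi / 2.0
    for _ in range(60):  # bisection on the monotone cap fraction
        mid = 0.5 * (lo + hi)
        if _sin_power_integral(mid, n) / full < 1.0 / M:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (5, 25):
        print(n, theta_1(n, 0.2))
```

For n = 3 the cap fraction reduces to $(1 - \cos\theta)/2$, which gives a convenient check on the quadrature.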



To evaluate $Q(\theta)$, it is convenient to use n-dimensional cylindrical
coordinates with origin located on the axis of the cone at a distance

$$l = \sqrt{n}\, A$$

from the vertex and within the cone. The z- or rotational axis of the
coordinate system coincides with the axis of the cone and is oriented so
that the vertex of the cone has z-coordinate $-l$. Denote distance from
the z-axis by r. Then an element of "area" distant r from the axis and
having radial dimension dr and axial dimension dz sweeps out volume

$$\frac{(n-1)\,\pi^{(n-1)/2}}{\Gamma\!\left(\frac{n+1}{2}\right)}\, r^{n-2}\, dr\, dz$$

when rotated about the z-axis. One has therefore

$$Q_n \equiv Q(\theta) = \frac{(n-1)\,\pi^{(n-1)/2}}{(2\pi)^{n/2}\,\Gamma\!\left(\frac{n+1}{2}\right)} \int_0^{\infty} dr\, r^{n-2} e^{-r^2/2} \int_{-\infty}^{ar - l} dz\, e^{-z^2/2}, \tag{9}$$

where we have set

$$a = \cot \theta.$$

Now set

$$c_n = \frac{n-1}{2^{n/2}\sqrt{\pi}\,\Gamma\!\left(\frac{n+1}{2}\right)} = \frac{1}{2^{(n-2)/2}\sqrt{\pi}\,\Gamma\!\left(\frac{n-1}{2}\right)}.$$

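One then has

$$\begin{aligned}
\frac{Q_n}{c_n} &= \int_0^\infty dr\, r\, e^{-r^2/2} \left( r^{n-3} \int_{-\infty}^{ar-l} dz\, e^{-z^2/2} \right) \\
&= \left[ -e^{-r^2/2}\, r^{n-3} \int_{-\infty}^{ar-l} dz\, e^{-z^2/2} \right]_{r=0}^{\infty}
 + (n-3) \int_0^\infty dr\, e^{-r^2/2}\, r^{n-4} \int_{-\infty}^{ar-l} dz\, e^{-z^2/2} \\
&\quad + a \int_0^\infty dr\, e^{-r^2/2}\, r^{n-3} \exp\left[ -\tfrac{1}{2}(ar-l)^2 \right] \\
&= (n-3)\,\frac{Q_{n-2}}{c_{n-2}} + a J_{n-2}, \qquad n > 3,
\end{aligned} \tag{10}$$

on integrating by parts. Here

$$\begin{aligned}
J_n &= \int_0^\infty dr\, r^{n-1} \exp\left[ -\frac{(1+a^2) r^2 - 2alr + l^2}{2} \right] \\
&= \frac{1}{1+a^2} \int_0^\infty dr\, r^{n-2} \left[ (1+a^2) r - al \right] \exp\left[ -\frac{(1+a^2) r^2 - 2alr + l^2}{2} \right] \\
&\quad + \frac{al}{1+a^2} \int_0^\infty dr\, r^{n-2} \exp\left[ -\frac{(1+a^2) r^2 - 2alr + l^2}{2} \right] \\
&= \frac{al}{1+a^2}\, J_{n-1} + \frac{n-2}{1+a^2} \int_0^\infty dr\, r^{n-3} \exp\left[ -\frac{(1+a^2) r^2 - 2alr + l^2}{2} \right] \\
&= \frac{al}{1+a^2}\, J_{n-1} + \frac{n-2}{1+a^2}\, J_{n-2}, \qquad n > 2.
\end{aligned} \tag{11}$$

BOUNDS ON COMMUNICATION

Now set

$$G_n = c_{n+2}\, J_n \csc\theta, \qquad b_n = \frac{\Gamma(n/2)}{\Gamma((n+1)/2)}, \qquad \xi = \frac{l}{\sqrt{2}}.$$

One has from (10) and (11)

$$\begin{aligned}
Q_n &= Q_{n-2} + \cos\theta\, G_{n-2}, & n > 3, \\
G_n &= \xi \cos\theta \sin\theta\, b_n G_{n-1} + \frac{n-2}{n-1} \sin^2\theta\, G_{n-2}, & n > 2, \\
b_n &= \frac{n-2}{n-1}\, b_{n-2}, & n > 2.
\end{aligned} \tag{12}$$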

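The initial values

$$\begin{aligned}
b_1 &= \sqrt{\pi}, \qquad b_2 = \frac{2}{\sqrt{\pi}}, \\
G_1 &= \tfrac{1}{2} \exp(-\xi^2 \sin^2\theta)\, \mathrm{erfc}(-\xi \cos\theta), \\
G_2 &= \frac{1}{\pi} \sin\theta\, e^{-\xi^2} + \frac{2\xi}{\sqrt{\pi}} \sin\theta \cos\theta\, G_1, \\
Q_3 &= \tfrac{1}{2}\, \mathrm{erfc}(\xi) + \cos\theta\, G_1,
\end{aligned}$$

with

$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^\infty e^{-t^2}\, dt,$$

permit one to compute $Q_n(\theta)$ for odd n from the recurrence (12). Since
$0 \le \theta \le \pi/2$, all quantities involved are positive.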
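
The recurrence lends itself to a short program. The following is a minimal sketch, assuming the quantities $b_n$, $G_n$ and $Q_n$ as defined in this appendix with $\xi = A\sqrt{n/2}$ held fixed throughout the recursion; the function name and argument conventions are illustrative and do not describe the program actually used for the curves.

```python
import math


def cone_error_prob(n, theta, snr):
    """Q_n(theta) for odd n >= 3 and 0 <= theta <= pi/2.

    `snr` is S/N as a power ratio (not db), so that A = sqrt(snr) and
    xi = A * sqrt(n / 2).
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    xi = math.sqrt(snr * n / 2.0)
    s, c = math.sin(theta), math.cos(theta)

    # Initial values.
    b = {1: math.sqrt(math.pi), 2: 2.0 / math.sqrt(math.pi)}
    G = {1: 0.5 * math.exp(-(xi * s) ** 2) * math.erfc(-xi * c)}
    G[2] = (s * math.exp(-xi * xi) / math.pi
            + 2.0 * xi / math.sqrt(math.pi) * s * c * G[1])
    Q = 0.5 * math.erfc(xi) + c * G[1]          # this is Q_3

    # Advance b_k and G_k up to k = n - 2; each odd k yields Q_{k+2}.
    for k in range(3, n - 1):
        b[k] = (k - 2) / (k - 1) * b[k - 2]
        G[k] = (xi * c * s * b[k] * G[k - 1]
                + (k - 2) / (k - 1) * s * s * G[k - 2])
        if k % 2 == 1:
            Q += c * G[k]
    return Q


if __name__ == "__main__":
    for n in (5, 25, 101):
        print(n, cone_error_prob(n, 0.9, 1.0))
```

Two limiting cases are useful checks. At $\theta = \pi/2$ the cone is a half-space and $Q_n = \tfrac{1}{2}\mathrm{erfc}(\xi)$ for every odd n; with no signal ($\xi = 0$) the result reduces to $1 - \Omega(\theta)/\Omega(\pi)$.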

The curves of Figs. 4, 5, and 6 were obtained as follows. With $\theta_1$ fixed
in value, $Q_5$, $Q_{15}$, $Q_{25}$, $Q_{51}$ and $Q_{101}$ were determined as functions of $\xi$ by
repeated application of the recurrence. A given $Q_n(\theta_1)$ was then expressed
as a function of the signal-to-noise ratio, A, by the relation $\xi = A\sqrt{n/2}$.
Values of A for which $Q_n(\theta_1)$ took the values $10^{-2}$, $10^{-4}$, $10^{-6}$ were determined
graphically. The corresponding rate was found from (7) and
(8). Repetition of the process for different values of $\theta_1$ permits plotting
the curves.

An integration by parts and (6) allow Shannon's upper bound to be
written in the form

$$\overline{Q} = \frac{M}{\sqrt{\pi}\, b_{n-1}} \int_0^{\theta_1} Q_n(\theta) \sin^{n-2}\theta\, d\theta. \tag{13}$$

Curves based on $\overline{Q}$, such as shown on Fig. 2, were obtained by using the
recurrence (12) to obtain values of $Q_n(\theta)$ for a fixed $\xi$. The integral in
(13) was evaluated numerically using a trapezoidal formula with 150
points of evaluation for the integrand. Values of $\xi$ and $\theta_1$ were expressed
in terms of R/W and S/N as already explained.

APPENDIX B 

The theorems and formulae of Shannon's Information Theory are 
statements about certain mathematical constructs. In order to make 
useful inferences from these formulae about physical communication 
systems, it is necessary to examine the sense in which the mathematical 
model approximates the behavior of the key elements of the physical 
system. At best, the correspondence between mathematical and physical 
entities is only a close approximation: the "true" theorems of the 
mathematical model, when stated in physical terms, are only "partial 
truths." 

The formula 

$$C = (a/2) \log_{10} (1 + S/N) \ \text{dits/second} \tag{14}$$

gives the capacity of the following mathematical channel. Real numbers 
are chosen at a transmitting point at the rate a numbers per second. 
Each number chosen is transmitted to the receiving point, but is per- 
turbed by an additive Gaussian variate, so that the ith transmitted real 
number, $s_i$, is received as $s_i + x_i$. The $x_i$ are assumed independent
Gaussian random variables with the same variance N. The transmitted
sequence satisfies the constraint

$$\lim_{K \to \infty} \frac{1}{K} \sum_{i=1}^{K} s_i^2 = S.$$

(The reader should consult Ref. 8, Chapter 9, for a more careful, rigorous
definition of this channel and a precise mathematical interpretation of 
the capacity formula.) 
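
The discrete-time channel just described is easy to simulate, which may help fix ideas. The sketch below is illustrative only; the function names and the use of a seeded pseudo-random generator are assumptions. It also shows that with a = 2W numbers per second, the rate suggested by n = 2WT in the body of the paper, (14) reduces to the capacity formula (1).

```python
import math
import random


def capacity_dits_per_second(a, S, N):
    """Capacity (14) of the discrete-time channel: a real numbers per
    second, average signal power S, noise variance N."""
    return 0.5 * a * math.log10(1.0 + S / N)


def transmit(numbers, N, seed=0):
    """Pass a sequence of real numbers through the channel: each s_i is
    received as s_i + x_i with x_i independent Gaussian, variance N."""
    rng = random.Random(seed)
    sigma = math.sqrt(N)
    return [s + rng.gauss(0.0, sigma) for s in numbers]


if __name__ == "__main__":
    W, S, N = 3.0, 7.0, 1.0
    print(capacity_dits_per_second(2.0 * W, S, N), W * math.log10(1.0 + S / N))
    print(transmit([1.0, -1.0, 1.0], N)[:3])
```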

The foregoing description of the channel is essentially that given by 
Shannon in Ref. 4. The channel is discrete in time ; there is no mention 
of bandlimited continuous functions of a time variable defined on the 
real line. Within the mathematical theory, there is no question of the 
validity of (14) for the capacity of the discrete time channel described
nor of the validity of Shannon's bounds for the error probability attain- 
able with block codes of finite length. The problem is to justify the 
application of these formulae derived for a discrete time mathematical 
channel to physical communication systems employing "continuous" 
signals of "bandwidth" W. 

I have placed quotation marks around the words continuous and 
bandwidth to call attention to the fact that these two concepts have no 
well-accepted operational definitions in terms of experiments in the real 
world. They are again part of another strictly mathematical model that 
is used to describe signals of the physical world. The elements of this 
mathematical model are the real number continuum, functions and 
Fourier analysis. The correspondence between these elements and 
observables of the laboratory (meter readings, etc.) is again an approxi- 
mation — a very good one in many circumstances, but a poor one in 
many others. It is meaningless to ask if the reading of a meter in the 
laboratory is a rational number or an irrational one, or if the trace seen 
on an oscilloscope is a continuous function in the sense used in the 
mathematical model. Within the mathematical model, there are many 
notions introduced for which one cannot easily find meaningful counter- 
parts in the real world of the laboratory. The asymptotic behavior of 
spectra at infinity is such an example. One must be very suspicious of 
the utility of applying in the real world formulae derived from the 
mathematical models which are sensitive to assumptions about those 
concepts of the model that have no operationally defined counterparts in 
the laboratory. 

It is evident that a good case for applying (14) to real communication 
systems can be made if one can justify the statement 

"In the laboratory, using signals of duration T and bandwidth W, 
we can communicate about 2WT numbers and no more." (15) 

Perhaps it would be simplest to take this statement as a basic axiom 
for practical communication engineering and justify it by experiment 
(with "bandwidth," "number," etc. suitably defined in operational 
terms). It is intellectually more satisfying, however, to be able to derive 
it from the mathematical models that have served so well to describe 
signals in other circumstances. 

The approach taken by Shannon in Ref. 6 and paraphrased here at 
the beginning of Section III is one method of deriving statements in the 
spirit of (15) from the usual mathematical model of signals and spectra. 
This approach is reasonably satisfactory in justifying the fact that for 
very large T one can transmit 2WT numbers using signals of
(mathematical) bandwidth W and nominal duration T. From it one can argue
rather convincingly that rates arbitrarily close to those given by the 
capacity formula can be achieved with arbitrarily small error probability 
using (mathematically) bandlimited functions for signaling. Using this 
approach, however, it is difficult to make a convincing argument that 
one cannot exceed capacity or that Shannon's bounds $Q_n$ and $\bar{Q}_n$ have
any significance for channels employing (mathematical) bandlimited 
functions. 

The difficulty here lies in the fact that mathematical bandlimited 
functions are entire functions and hence perfectly predictable for all 
time from knowledge over any finite interval. If one allows all the usual 
mathematical operations, the receiver, on the basis of observing the 
bandlimited signal plus noise in an arbitrarily short time interval, could 
extrapolate this function for all time and obtain sample values at an 
arbitrarily great rate. 

The heart of the dilemma presented here lies in the fact that the 
mathematical specification that a signal be bandlimited is a statement 
about concepts of the model that have no well defined physical counter- 
part — namely, the behavior of spectra at infinity. The sampling 
theorem, unfortunately, requires an assumption about this nonphysically 
interpretable part of the mathematical model. 

Yet, one feels that in the real world something like (15) holds with 
laboratory meanings for bandwidth. If so, this should be derivable from 
the mathematical model of functions and Fourier analysis without 
making assumptions in the model about such nonphysical entities as the 
behavior of spectra at infinity. A result of this sort is indeed the content 
of an important theorem recently published by Pollak and Landau (Ref. 9).
Their results are too complex to discuss in detail here. The main point 
is that within the classical model of functions and Fourier analysis they 
define a suitable class of functions that are "limited" in both time and 
frequency. The definition of this class does not entail specification of 
spectral behavior at infinity. The specification, when translated to 
physical terms, involves only an assumption about one's ability to 
measure energy, and the correspondence between their class and labora- 
tory bandlimited signals defined in an operational way is easy to make. 
They prove that in an appropriate sense this class of functions is 2WT- 
dimensional. From this, a form of statement (15) results which is, I 
believe, the best justification on theoretical grounds to date of this 
important postulate. 

Quite apart from this difficulty of justifying (15), there are, of course, 
many other ways in which the mathematical model only approximates 
the behavior of equipment in the laboratory: measurement errors
prevent one from specifying real numbers meaningfully by more than a
finite number of significant figures; disturbances are not truly Gaussian; 
etc., etc. The attainment of arbitrarily small error by sufficient encoding 
in the mathematical theory entails a delicate balance between many 
quantities which only approximate their physical counterparts. One 
should not believe that real communication systems can be built which 
will signal at fixed rates with arbitrarily small error. Somewhere, for 
large enough n, the mathematical model fails to describe adequately 
the physical realities. How large is this n? This is a very difficult ques- 
tion. My engineering judgment is that the results given on the curves of 
this paper for n up to 100 might conceivably be achieved with real com- 
munication systems. Until we have learned to describe and instrument 
optimal codes of this size, I am safe from experimental contradiction. 
Today, this time seems remote. 

APPENDIX C

We show here that if

$$ Q_n(R/W,\, S/N) = P_e \tag{16} $$

and

$$ f = \log\,(1 + S/N) \tag{17} $$

then, with $n$ and $P_e$ fixed ($0 < P_e < 1$),

$$ \lim_{R/W \to \infty} \frac{R/W}{f} = \frac{n-1}{n}. $$

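Referring to (7) and (8) we see that if $R/W \to \infty$, then $\theta_1 \to 0$.
Indeed, for small values of $\theta_1$, one can easily develop the incomplete beta
function to obtain

$$ \frac{R}{W} = \frac{2}{n}\left[\ln\,(n-1)\,\beta\!\left(\frac{n-1}{2}, \frac{1}{2}\right) - (n-1)\ln\sin\theta_1 + O(\theta_1^2)\right]\log_{10} e. \tag{18} $$

Here $\beta(x, y) = \Gamma(x)\Gamma(y)/\Gamma(x + y)$ as usual.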
It is now convenient to write equation (9) as

$$ \frac{Q_n}{d} = \int dr \int dz\; r^{n-2} \exp\left[-(r^2 + z^2)/2\right] $$

where

$$ d = \frac{(n-1)\,\pi^{(n-1)/2}}{(2\pi)^{n/2}\,\Gamma\left(\frac{n+1}{2}\right)} $$

and as before we adopt the abbreviation $A = \sqrt{S/N}$. The region of
integration is shaded in Fig. 12.




Fig. 12. Integration region and coordinate transformation (the figure labels a length $\sqrt{n}\,A$).

To investigate the behavior of $Q_n$ as $\theta_1 \to 0$, it is convenient to
transform the integral by the rotation

$$ z = x\cos\theta_1 - y\sin\theta_1 $$

$$ r = x\sin\theta_1 + y\cos\theta_1 $$

and to write the result as the integral over the region $y \ge \sqrt{n}\,A\sin\theta_1$
minus the integral over the region $G$ indicated in the figure.

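$$
\begin{aligned}
\frac{Q_n}{d} &= \int_{\sqrt{n}A\sin\theta_1}^{\infty} dy \int_{-\infty}^{\infty} dx\,(x\sin\theta_1 + y\cos\theta_1)^{n-2} \exp\left[-\frac{x^2 + y^2}{2}\right] \\
&\quad - \iint_G dy\,dx\,(x\sin\theta_1 + y\cos\theta_1)^{n-2} \exp\left[-\frac{x^2 + y^2}{2}\right].
\end{aligned}
$$

With $A \ge 0$, the integral over $G$ vanishes as $\theta_1 \to 0$, so

$$
\begin{aligned}
\frac{Q_n}{d} &\to (\cos\theta_1)^{n-2} \int_{\sqrt{n}A\sin\theta_1}^{\infty} dy \int_{-\infty}^{\infty} dx\,(y + x\tan\theta_1)^{n-2} \exp\left[-\frac{x^2 + y^2}{2}\right] \\
&\to \int_{\sqrt{n}A\sin\theta_1}^{\infty} dy\; y^{n-2} \exp(-y^2/2) \int_{-\infty}^{\infty} dx\,\exp(-x^2/2) \\
&= \sqrt{2\pi} \int_{\sqrt{n}A\sin\theta_1}^{\infty} dy\; y^{n-2} \exp\left[-y^2/2\right] + O(\theta_1).
\end{aligned}
$$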

One thus finds that if $A\theta_1 \to \infty$, $Q_n/d \to 0$, whereas if $A\theta_1 \to 0$, $Q_n \to 1$.
To maintain (16), therefore, we must have $A\theta_1 = a + O(\theta_1)$ where
$0 < a < \infty$, or

$$ A \sim (a/\theta_1). \tag{19} $$

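Equations (17) and (18) now give

$$ \lim_{R/W \to \infty} \frac{R/W}{f} = \lim_{\theta_1 \to 0} \frac{2}{n}\, \frac{\ln\,(n-1)\,\beta\!\left(\frac{n-1}{2}, \frac{1}{2}\right) - (n-1)\ln\sin\theta_1}{\ln\left(1 + \dfrac{a^2}{\theta_1^2}\right)} = \frac{n-1}{n} $$

as was to be shown.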
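
The slow, logarithmic approach to this limit can be seen numerically. The Python sketch below evaluates the ratio (R/W)/f under stated assumptions: R/W is taken as (2/n)[ln (n-1)β((n-1)/2, 1/2) - (n-1) ln sin θ_1] log10 e (the small-θ_1 expansion (18) with its correction term dropped), and S/N is taken as (a/θ_1)^2 following (19). The common factor log10 e cancels in the ratio. The function names and the values n = 10 and a = 1 are hypothetical choices for illustration.

```python
import math

def log_beta(x, y):
    """Natural log of beta(x, y) = Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def rate_ratio(n, a, theta_1):
    """Approximate (R/W)/f for small theta_1, with S/N = (a/theta_1)**2."""
    numerator = (2.0 / n) * (
        math.log(n - 1)
        + log_beta((n - 1) / 2.0, 0.5)
        - (n - 1) * math.log(math.sin(theta_1))
    )
    # ln(1 + a^2/theta^2) = 2 ln(a/theta) + ln(1 + theta^2/a^2);
    # written this way to avoid overflow when theta_1 is extremely small.
    denominator = 2.0 * math.log(a / theta_1) + math.log1p((theta_1 / a) ** 2)
    return numerator / denominator

n, a = 10, 1.0                       # hypothetical illustrative values
for theta_1 in (1e-2, 1e-10, 1e-100, 1e-300):
    print(theta_1, rate_ratio(n, a, theta_1), (n - 1) / n)
```

Even at extremely small θ_1 the ratio is only close to (n-1)/n, not equal to it, because the constant terms in numerator and denominator decay only as 1/ln(1/θ_1).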

The preceding considerations also allow one to show directly that the
curves of Figs. 4, 5 and 6 rise indefinitely as $R/W \to \infty$. For a given
$R/W$, denote by $A_c^2$ the corresponding signal-to-noise ratio obtained
from the capacity formula, so that $R/W = \log\,(1 + A_c^2)$. Then
$A_c^2 \sim 10^{R/W}$ as $\theta_1 \to 0$. From (18) one finds,

$$ A_c^2 \sim \left[\frac{(n-1)\,\beta\!\left(\frac{n-1}{2}, \frac{1}{2}\right)}{\sin^{n-1}\theta_1}\right]^{2/n}. $$

Using (19), one then has

$$ A^2/A_c^2 \sim c/\theta_1^{2/n} $$

with $c$ a positive constant. As $R/W \to \infty$, $\theta_1 \to 0$ and $A^2/A_c^2 \to \infty$. The
logarithm of this latter ratio is plotted on Figs. 4, 5, and 6.



APPENDIX D

Each word of a block code dictionary is a sequence of n real numbers 
which may be regarded as a point in an n-dimensional Euclidean space. 
The points of an equal energy block code all lie on the surface of a
hypersphere of radius $\sqrt{nS}$ with center at the origin. The words of a
restricted energy block code all lie on the surface or within such a sphere.
In this geometric picture, the effect of the noise in the channel can be
visualized by surrounding each word of the code by a sphere of radius
$\sqrt{nN}$ centered at the word. Due to the noise on the channel, a received
word lies on the average at a distance $\sqrt{nN}$ from the corresponding
transmitted word. If the code is to have a small average error probability, 
the noise spheres surrounding the words of the code must not overlap 
too much. On the other hand, to achieve a large rate, it is necessary to 
have many words in the dictionary. 
The volume of a sphere of radius $r$ in $n$-space is proportional to $r^n$.
The fraction of the volume of such a sphere that lies external to the
concentric sphere of radius $ar$, $0 < a < 1$, is therefore

$$ \frac{r^n - (ar)^n}{r^n} = 1 - a^n. $$

For large enough $n$, then, almost all the volume of the sphere lies near
its surface. For example, if $n \ge 460$, then at least 99 per cent of the
volume of the sphere lies within a thin skin of the surface whose
thickness is 1 per cent of the radius of the sphere.
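
The figure quoted in this example is easy to check. The Python sketch below evaluates the fraction 1 - a^n and searches for the smallest dimension at which a skin of given fractional thickness holds a given share of the volume. The function names are illustrative assumptions; the formula is the one stated above.

```python
import math

def skin_fraction(n, a):
    """Fraction 1 - a**n of an n-sphere's volume lying outside radius a*r."""
    return 1.0 - a ** n

def smallest_dimension(a, fraction):
    """Smallest n with 1 - a**n >= fraction, for 0 < a < 1 and 0 < fraction < 1."""
    if not (0.0 < a < 1.0 and 0.0 < fraction < 1.0):
        raise ValueError("require 0 < a < 1 and 0 < fraction < 1")
    # Closed-form starting guess from a**n <= 1 - fraction.
    n = max(1, math.ceil(math.log(1.0 - fraction) / math.log(a)))
    # Adjust by direct evaluation to guard against rounding at the boundary.
    while skin_fraction(n, a) < fraction:
        n += 1
    while n > 1 and skin_fraction(n - 1, a) >= fraction:
        n -= 1
    return n

# Skin of thickness 1 per cent (a = 0.99) holding 99 per cent of the volume.
print(smallest_dimension(0.99, 0.99))   # 459
print(skin_fraction(460, 0.99))
```

The search gives 459 as the smallest such dimension, so the condition n ≥ 460 stated in the text is sufficient, with a small margin.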

Suppose now that $N$ and $S$ are fixed, and consider the problem of
placing code words on or within the sphere of radius $\sqrt{nS}$ so that the
spheres of radius $\sqrt{nN}$ surrounding each code word do not overlap
appreciably. The radius of these noise spheres is a fixed fraction, $\sqrt{N/S}$,
of the radius of the large sphere of radius $\sqrt{nS}$. As $n$ becomes large,
almost all of the volume of the large sphere lies within a skin of the
surface of fractional thickness much less than $\sqrt{N/S}$. It is not surprising,
then, that little is to be gained by placing code words interior to the large
sphere. Indeed, Shannon's bounds prove that in the limit $n \to \infty$
restricted energy block codes give no better performance than equal 
energy block codes. 

In contrast now consider the situation when n and S are fixed and 
R/W becomes large. As we seek to place more and more code words on 
or within the sphere of radius $\sqrt{nS}$, the noise power $N$ must be
continuously decreased to prevent the noise spheres surrounding the code
words from overlapping. Ultimately, for large enough rates, $N$ must be
made so small that the radii of these noise spheres are very small compared
to the thickness of the skin of the sphere of radius $\sqrt{nS}$ containing most
of its volume. It then becomes possible to pack appreciable numbers of 
code words interior to this sphere and restricted energy codes then give 
better performance than equal energy codes. 

The asymptotic behavior of the dashed curves of Fig. 11 can readily
be deduced from the bound (5) and the material of Appendix C. The
curves are given by

$$ P_e = Q_{n+1}\!\left(\frac{n}{n+1}\,\frac{R}{W},\ \frac{S}{N}\right). $$

To maintain $0 < P_e < 1$, we find as in the derivation of (19) that

$$ A \sim (a/\theta_1) $$

where $a$ is given by

$$ P_e = \frac{n\,\pi^{n/2}}{(2\pi)^{n/2}\,\Gamma\left(\frac{n}{2} + 1\right)} \int_{a\sqrt{n+1}}^{\infty} dy\; y^{n-1} \exp(-y^2/2) = \frac{1}{\Gamma(n/2)} \int_{a^2(n+1)/2}^{\infty} dt\; t^{(n/2)-1} e^{-t}. $$

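In the right member of (18), replace $n$ by $n + 1$; in the left member,
replace $R/W$ by $[n/(n + 1)](R/W)$. There results

$$ \frac{R}{W} \sim \log \left[\frac{n\,\beta\!\left(\frac{n}{2}, \frac{1}{2}\right)}{\sin^{n}\theta_1}\right]^{2/n}. $$

It follows then that

$$ A^2/A_c^2 \sim \frac{a^2}{\left[n\,\beta\!\left(\frac{n}{2}, \frac{1}{2}\right)\right]^{2/n}} $$

so that

$$ 10 \log \frac{A^2}{A_c^2} \sim 20 \left\{\log a - \frac{1}{n} \log \left[n\,\beta\!\left(\frac{n}{2}, \frac{1}{2}\right)\right]\right\}. $$

This latter value is the horizontal asymptote for the dashed curves of
Fig. 11.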
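
A sketch of how such an asymptote might be evaluated is given below in Python. It rests on assumptions that should be checked against the printed equations: that a is fixed by P_e = Γ(n/2, a^2(n+1)/2)/Γ(n/2) (the regularized upper incomplete gamma function), that the asymptote is 20{log10 a - (1/n) log10[n β(n/2, 1/2)]} decibels, and that n is even so the incomplete gamma function reduces to a finite sum. All function names are hypothetical.

```python
import math

def upper_gamma_regularized(k, x):
    """Gamma(k, x)/Gamma(k) for a positive integer k: exp(-x) * sum_{j<k} x**j / j!."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total

def solve_a(n, p_e, tol=1e-12):
    """Solve p_e = Gamma(n/2, a^2 (n+1)/2) / Gamma(n/2) for a (even n only)."""
    if n < 2 or n % 2 or not 0.0 < p_e < 1.0:
        raise ValueError("require even n >= 2 and 0 < p_e < 1")
    k = n // 2
    lo, hi = 0.0, 1.0
    while upper_gamma_regularized(k, hi) > p_e:   # bracket the root in x
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):           # bisection; function decreases in x
        mid = 0.5 * (lo + hi)
        if upper_gamma_regularized(k, mid) > p_e:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)                           # x = a^2 (n + 1) / 2
    return math.sqrt(2.0 * x / (n + 1))

def asymptote_db(n, p_e):
    """Assumed asymptote 20 * (log10 a - (1/n) * log10(n * beta(n/2, 1/2)))."""
    a = solve_a(n, p_e)
    ln_n_beta = (math.log(n) + math.lgamma(n / 2.0) + math.lgamma(0.5)
                 - math.lgamma(n / 2.0 + 0.5))
    return 20.0 * (math.log10(a) - ln_n_beta / (n * math.log(10.0)))

print(solve_a(2, 1e-4), asymptote_db(2, 1e-4))
```

For n = 2 the equation for a can be solved by hand, since P_e = exp(-3a^2/2) and n β(1, 1/2) = 4; this gives a simple check on the bisection.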